Reproducible data collection
#| label: setup
#| echo: false
#| eval: true
#| message: false
library(knitr)
library(tidyverse)
library(lubridate)
Content
Design spreadsheet
Spreadsheet content
- ID (unique identifier for each observation or individual)
- Date, time, observation number
- Location: region/site
- Experimental design: block, plot, replicate, number of observations, treatments
- Organism: species/population/genet
- Response
- Predictors
- Metadata: recorder/scribe, weather, notes
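As a rough sketch, the fields above could map onto an empty data-entry template like the one below. All column names and types are illustrative assumptions, not a prescribed schema.

```r
# Hypothetical data-entry template; every column name here is an assumption.
template <- data.frame(
  id            = character(),           # unique ID per observation or individual
  date          = as.Date(character()),  # date of observation
  time          = character(),
  site          = character(),           # location
  block         = integer(),             # experimental design
  plot          = integer(),
  treatment     = character(),
  species       = character(),           # organism
  height_cm     = numeric(),             # example response
  temperature_c = numeric(),             # example predictor
  recorder      = character(),           # metadata
  weather       = character(),
  notes         = character(),
  stringsAsFactors = FALSE
)

str(template)
```

Putting the unit in the column name (`height_cm`) keeps the cells themselves free of units.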
Design spreadsheet - data validation
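Data validation restricts what can be typed into a cell. The same rules can also be checked after import. The sketch below is a minimal example, assuming hypothetical columns `treatment` and `height_cm` and made-up allowed values.

```r
# Made-up example data; column names and allowed values are assumptions.
dat <- data.frame(
  treatment = c("control", "warming", "Warming", "control"),
  height_cm = c(12.1, 8.4, 10.2, -3)
)

allowed_treatments <- c("control", "warming")

# Flag rows that break a rule: unknown treatment level or impossible height.
bad_treatment <- !dat$treatment %in% allowed_treatments
bad_height    <- is.na(dat$height_cm) | dat$height_cm < 0

# Rows that need to be checked against the field notes
dat[bad_treatment | bad_height, ]
```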
Rectangular spreadsheet - good example
- Keep the spreadsheet rectangular.
- Do not leave empty cells, rows, or columns, and do not add titles or double headers.
![]()
Figure 1: Rectangular spreadsheet, good example.
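Empty cells, rows, and columns can be detected once the spreadsheet is read into R. This is a minimal sketch on made-up data; the column names are assumptions.

```r
# Made-up data with empty cells, one empty row and one empty column.
dat <- data.frame(
  site    = c("A", "B", NA, "C"),
  height  = c(10.2, NA, NA, 7.5),
  comment = c(NA, NA, NA, NA)
)

anyNA(dat)                               # are there any empty cells?
empty_rows <- rowSums(!is.na(dat)) == 0  # rows with no values at all
empty_cols <- colSums(!is.na(dat)) == 0  # columns with no values at all

which(empty_rows)       # row positions to inspect
names(dat)[empty_cols]  # column names to inspect
```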
Rectangular spreadsheet - bad example
![]()
Figure 2: Rectangular spreadsheet, bad example.
Single value per cell
Tidy spreadsheets follow three rules:
- each variable has its own column,
- each observation has its own row,
- each cell, at the intersection of a row and a column, contains a single value.
![]()
Figure 4: Wide (A) and long (B) data table.
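A wide table can be reshaped into a long one with `pivot_longer()` from tidyr. The sketch below uses made-up counts; the column names `site`, `year`, and `count` are assumptions chosen for illustration.

```r
library(tidyr)

# Made-up wide table: one column per year.
wide <- data.frame(
  site   = c("A", "B"),
  `2021` = c(3, 5),
  `2022` = c(4, 6),
  check.names = FALSE  # keep the year column names as they are
)

# Long format: one row per site and year, one column per variable.
long <- pivot_longer(
  wide,
  cols      = c(`2021`, `2022`),
  names_to  = "year",
  values_to = "count"
)

long
```

In the long table, `year` is a variable in its own column instead of being spread across the header.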
Consistency
![]()
Figure 5: Inconsistency in species names
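Some inconsistencies, such as differences in case or stray spaces, can be cleaned up in code. This is a minimal sketch on made-up entries; it will not catch misspellings or abbreviations, which still need a manual check.

```r
# Made-up species entries with inconsistent case and stray whitespace.
species <- c(
  "Bistorta vivipara",
  "bistorta vivipara",
  " Bistorta vivipara ",
  "Bistorta  vivipara"
)

# Normalise: trim the ends, collapse repeated spaces,
# then capitalise only the first letter.
clean <- trimws(species)
clean <- gsub("\\s+", " ", clean)
clean <- paste0(toupper(substring(clean, 1, 1)), tolower(substring(clean, 2)))

unique(clean)  # remaining distinct spellings
```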
Meaningful names
![]()
Figure 6: Final doc by PhDcomics.com
Style
![]()
Figure 7: Different styles for naming objects. Credit: Allison Horst.
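Whatever style is chosen, it helps to apply it everywhere. As an illustration, the hypothetical helper `to_snake_case()` below converts made-up column names to snake_case using base R only; it is a sketch, not a complete name cleaner.

```r
# Made-up column names in mixed styles.
old_names <- c("Plot ID", "speciesName", "Height.cm", "date-collected")

# Hypothetical helper, written for this example.
to_snake_case <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # split camelCase words
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # spaces, dots, dashes -> underscore
  x <- gsub("^_+|_+$", "", x)                   # drop leading/trailing underscores
  tolower(x)
}

to_snake_case(old_names)
```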
Standards
Use global data standards when available.
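One widely used standard is ISO 8601 for dates (YYYY-MM-DD). The sketch below converts made-up dates to that format; the day.month.year input format is an assumption about how the dates were typed.

```r
# Made-up dates as they might be typed in the field (day.month.year assumed).
raw_dates <- c("03.07.2023", "15.07.2023")

# Parse with an explicit format, then write out as ISO 8601 (YYYY-MM-DD).
dates <- as.Date(raw_dates, format = "%d.%m.%Y")
iso_dates <- format(dates, "%Y-%m-%d")

iso_dates
```

Stating the input format explicitly avoids guessing whether 03.07 means 3 July or 7 March.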